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ABSTRACT 

A modified version of stocliastic tunneling, a recently introduced global optimiza- 
tion technique, is introduced as a new generalized-ensemble technique and tested for a 
benchmark peptide, Met-enkephalin. It is demonstrated that the new technique allows to 
evaluate folding properties and especially the glass temperature Tg of this peptide. 
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Numerical simulations of biological molecules can be extremely difficult when the 
molecule is described by "realistic" energy functions where interactions between all atoms 
are taken into account. For a large class of molecules, in particular for peptides or proteins, 
the various competing interactions lead to frustration and a rough energy landscape. At 
low temperatures canonical simulations will get trapped in one of the multitude of local 
minima separated by high energy barriers and physical quantities cannot be calculated ac- 
curately. One way to overcome this difficulty in protein simulations is by utilizing so-called 
generalized ensembles^, which are based on a non-Boltzmann probability distribution. 
Multicanonical sampling and simulated tempering are prominent examples of such 
an approach. Application of these techniques to the protein folding problem was ffist 
addressed in Ref. and their usefulness for simulation of biological molecules and other 
complex systems has become increasingly recognized. 

However, generalized-ensemble methods are not without problems. In contrast to 
canonical simulations the weight factors are not a priori known. Hence, for a computer 
experiment one needs estimators of the weights, and the problem of finding good estima- 
tors is often limiting the use of generalized-ensemble techniques. Here we describe and 
test a new generalized ensemble where determination of the weights is by construction of 
the ensemble simple and straightforward. Our method is based on a recently introduced 
global optimization technique, stochastic tunneling 0. 

Canonical simulations of proteins at low temperature are hampered by the roughness 
of the potential energy surface: local minima are separated by high energy barriers. To 
enhance sampling we propose to weight conformations not with the Boltzmann factor 
wb{E) = exp{—E/kBT), but with a weight 

Wf{E) = exp{f{E)/kBT) . (1) 

Here, T is a low temperature, ks the Boltzmann constant, and f{E) is a non-linear 
transformation of the potential energy onto the interval [0, —1] chosen such that the 
relative location of all minima is preserved. The physical idea behind such an approach is 
to allow the system at a given low temperature T to ^HunneT through energy barriers of 
arbitrary height, while the low energy region is still well resolved. A transformation with 
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the above characteristics can be reahzed by 

fi{E) = -e-(^-^o)/"^ . (2) 

Here, Eq is an estimate for the ground state and np the number of degrees of freedom of the 
system. Eq. ^ is a special choice of the transformation recently introduced under the name 
"stochastic tunneling" 0] to the corresponding problem of global minimization in complex 
potential energy landscapes. One can easily find further examples for transformations with 
the above stated properties, for instance, 

f2{E) = -{l + {E-Eo)/nFr' ■ (3) 

We will restrict our investigation to these two transformations without claiming that they 
are an optimal choice. 

A simulation in the above ensemble, defined by the weight of Eq. |T] with a suitable 
chosen non-linear transformation f{E), will sample a broad range of energies. Hence, 
application of re-weighting technique allows to calculate the expectation value of any 
physical quantity O over a large range of temperatures T by 

/ dE 0{E)Pf{E)w^\E)e-'''^^'^ 

<0>T =^—. . (4) 

/ dE Pf{E)wf{E)e-'''''^^ 

In this point our method is similar to other generalized-ensemble techniques such as the 
multicanonical sampling [0], however, our method differs from them in that the weights are 
explicitly given by Eq. One only needs to find an estimator for the ground-state energy 
Eq in the transforming functions fi{E) or f2{E) (see Eqs. ^ and which in earlier work 



IT] , was found to be much easier than the determination of weights for multicanonical 
algorithm ||^ or simulated tempering Q]. 

The new simulation technique was tested for Met-enkephalin, one of the simplest pep- 
tides, which has become a often used model to examine new algorithms. Met-enkephalin 
has the amino- acid sequence Tyr-Gly-Gly-Phe-Met. The potential energy function Etot 
that was used is given by the sum of the electrostatic term E^g, 12-6 Lennard- Jones term 
Evdw, and hydrogen-bond term Ehb for all pairs of atoms in the peptide together with 
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the torsion term Etors for all torsion angles: 

Etot = Ef,s + E^dW + -^/ife + Etors-, (5) 

B„ = (6) 

^hh = J2 ( ZtI - ) ' (8) 

^iors = Y.Ui{l±cos{niXi)), (9) 

where Vij is the distance between the atoms i and j, and Xi is the /-th torsion angle. The 

parameters {qi, Aij, Bij,Cij, Dij,Ui and rii) for the energy function were adopted from 
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The computer code SMCQ was used. The simulations were started from completely 
random initial conformations (Hot Start) and one Monte Carlo sweep updates every tor- 
sion angle of the peptide once. The peptide bond angles u were fixed to their common 
value 180°, which left 19 torsion angles (0, ip, and x) independent degrees of freedom 
(i.e., Up = 19). The interaction of the peptide with the solvent was neglected in the simu- 
lations and the dielectric constant e set equal to 2. In short preliminary runs it was found 
that T = 8 K was the optimal temperatures for simulations relying on the transformation 
fi{E) (Eq. and T = 6K for simulations relying on the second chosen transformation 
f2{E) (Eq. The free parameter Eq was set in Eq. |^ or (|]) to Eq = —10.72 kcal/mol, 
the ground state energy as known from previous work. In addition, simulations were also 
performed where Eq was dynamically updated in the course of the simulation and set to 
the lowest ever encountered energy. In these runs the (known) ground state was found 
in less than 5000 MC sweeps. Hence, determination of the weights is easier than in other 
generalized-ensemble techniques since in earlier workQ] it was found that at least 40,000 
sweeps were needed to calculate multicanonical weights. We remark that a Monte Carlo 
sweep in both algorithm takes approximately the same amount of CPU time. 

All thermodynamic quantities were then calculated from a single production run of 

1,000,000 MC sweeps which followed 10,000 sweeps for thermalization. At the end of 
^The program SMC was written by Dr. Frank Eiscnmenger (eisenmenger@rz.hu-bcrlin.de) 



every sweep we stored the energies of the conformation and the radius of gyration 



for further analyses. 

In order to demonstrate the dynamical behavior of the algorithm the "time series" and 
histograms of potential energy are shown for both choices of the transforming functions 
fi{E) (Fig. 1) and f2{E) (Fig. 2). Both choices of the non-linear transformation with 
which the energy landscape was deformed in the simulations lead to qualitatively the 
same picture. In Fig. la and Fig. 2a, respectively, one can see that the whole energy range 
between E < —10 kcal/mol (the ground state region) and E ^ 20 kcal/mol (high-energy, 
coil states) is sampled. However, unlike in the multicanonical algorithm the energies are 
not sampled uniformly and low-energy states appear with higher frequency than high 
energy states. However, as one can see from the logarithmic scale of Fig. lb and 2b where 
the histograms are displayed for these simulations, high-energy states are only suppressed 
by three orders of magnitude and their probability is still large enough to allow crossing of 
energy barriers. Hence large parts of the configuration space are sampled by our method 
and it is justified to calculate from these simulations thermodynamic quantities by means 
of re-weighting, see Eq. |[ 

Here, the average radius of gyration < R>, which is is a measure for the compactness 
of protein configurations and defined in Eq. [T^, was calculate for various temperatures. 
In Fig. 3 the results for the new ensemble, using the defining non-linear transformations 
fi{E) or f2{E), are compared with the ones of a multicanonical run with equal number 
of Monte Carlo sweeps. As one can see, the values of < i? > (T) agree for all three 
simulations over the whole temperature range. Hence, it is obvious that simulations in 
the new ensemble are indeed well able to calculate thermodynamic averages over a wide 
temperature range. 

After having established the new techniques as a possible alternative to other generalized- 
ensemble techniques such as multicanonical sampling or simulated tempering, its useful- 
ness shall be further demonstrated by calculating the free energy of Met-enkephalin as a 
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function of R: 

G{R) = -keT log P{R) (11) 

where 

P{R) = Pf{R) * w7i(E(i?))e-^(^)/'^«^ . (12) 

Here, a normalization is chosen where the minimal value of Gmin{R) = 0. The chosen 
temperature was T = 230K, which was found in earlier work as the folding temperature 
Tf of Met-enkephalin. The results, which rely on the transformation fi{E) of the energy 
landscape given by Eq. ^ are displayed in Fig. 4. At this temperature one observes 
clearly a "funnel" towards low values of R which correspond to compact structures. Such 
a funnel-like landscape was already observed in Ref. for Met-enkephalin, utilizing a 
different set of order parameters, and is predicted by the landscape theory of folding ||T4 |. 



The essence of the funnel landscape idea is competition between the tendency towards 
the folded state and trapping due to ruggedness of the landscape. One way to measure 
this competition is by the ratio [0: 



Q = ^i, (13) 
'E^ - E^ 



where the bar denotes averaging over compact configurations. The landscape theory 
asserts that good folding protein sequences are characterized by large values of Q ||15|| . 
Using the results of our simulations and defining a compact structure as one where R{i) < 



23A, we find E - Eq = 13.96(3) Kcal/mol, E^ - E^ = 0.49(2), from which we estimate 
for the above ratio Q = 20.0(5). This value indicates that Met-enkephalin is good folder 
and is consistent with earlier work [0 where we evaluated an alternative characterization 
of folding properties. Thirumalai and collaborators [|I^ have conjectured that the kinetic 
accessibility of the native conformation can be classified by the parameter 

<^ = ^, (14) 

-i-e 

i.e., the smaller a is, the more easily a protein can fold. Here Tf is the folding temperature 
and To the collapse temperature. With values for Tg = 295 K and Tf = 230 K, as measured 
in Ref. 0], one has for Met-enkephalin a ~ 0.2, indicating again that the peptide has good 
folding properties. 
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Yet another characterization of folding properties rehes on knowledge of the glass tem- 
perature Tg and is closely related to Eq. |1^. As the number of available states gets reduced 
with the decrease of temperature, the possibility of local trapping increases substantially. 
Glassy behavior appears when the residence time in some local traps becomes of the order 
of the folding event. Folding dynamics is now non-exponential since different traps have 



different escape times ^7j. For temperatures above the glass transition temperature Tg, 
the folding dynamics is exponential and a configurational diffusion coefficient average the 
effects of the short lived traps [jl8[. It is expected that for a good folder the glass transi- 
tion temperature, Tg, where glass behavior sets in, has to be significantly lower than the 
folding temperature Tf, i.e. a good folder can be characterized by the relation [|19| 



^ > 1 • (15) 

I present here for the first time a numerical estimate of this glass transition temperature for 
the peptide Met-enkephalin. The calculation of the estimate is based on the approximation 
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where the bar indicates again averaging over compact structures and 5*0 is the entropy of 
these states estimated by the relation 



\ogw{i) 

Sq = — logz-C (17) 

Here, z = compact'^ if) and C chosen such that the entropy of the ground state becomes 
zero. The results of the simulation in the new ensemble defined by the transformation 
fi{E), leads to a value of Sq = 2.3(7). Together with the above quoted value for E"^ — 
E"^ = 0.49(2) (in (Kcal/mol)^) one therefore finds as an estimate for the glass transition 
temperature 

Tg = 160(30) K . (18) 

Since it was found in earlier work that Tf = 230(30), it is obvious that the ratio 
Tf/Tg > 1 and again one finds that Met-enkephalin has good folding properties. Hence, 
we see that there is a strong correlation between all three folding criteria. 
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Let me summarize my results. I have proposed to utilize a recently introduced global 
optimization technique, stochastic tunneling, in such a way that it allows calculation 
of thermodynamic quantities. The new generalized-ensemble technique was tested for a 
benchmark peptide, Met-enkephalin. It was demonstrated that the new technique allows 
to evaluate the folding properties of this peptide and an estimate for the glass transition 
temperature Tg in that system was presented. Currently I am evaluating the efficiency of 
the new method in simulations of larger molecules. 
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FIGURE CAPTIONS: 

1. "Time series" (a) of potential energy E of Met-enkephalin for a simulation in a gener- 
alized ensemble defined by the transformation fi{E) of Eq. ^ and the corresponding 
histogram (b) of potential energy. 

2. "Time series" (a) of potential energy E of Met-enkephalin for a simulation in a 
generalized ensemble defined by the transformation f2{E) of Eq. ^ (a) and the 
corresponding histogram (b) of potential energy. 

3. Average radius of gyration < i? > (in A^) as a function of temperature (in K). 
The results of a multicanonical simulation of 1,000,000 MC sweeps were compare 
with simulations of equal statistics in the new ensemble utilizing either the no-linear 
transformation fi{E) or f2{E). 

4. Free energy G{R) as a function of the radius of gyration R for T = 230 K. The results 
rely on a generalized-ensemble simulation based on the transformation fi{E) of the 
energy landscape s defined in Eq. ^ 
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